Improving Dependency Parsers using Combinatory Categorial Grammar
نویسندگان
چکیده
Subcategorization information is a useful feature in dependency parsing. In this paper, we explore a method of incorporating this information via Combinatory Categorial Grammar (CCG) categories from a supertagger. We experiment with two popular dependency parsers (Malt and MST) for two languages: English and Hindi. For both languages, CCG categories improve the overall accuracy of both parsers by around 0.3-0.5% in all experiments. For both parsers, we see larger improvements specifically on dependencies at which they are known to be weak: long distance dependencies for Malt, and verbal arguments for MST. The result is particularly interesting in the case of the fast greedy parser (Malt), since improving its accuracy without significantly compromising speed is relevant for large scale applications such as parsing the web.
منابع مشابه
Improving Combinatory Categorial Grammar Parse Reranking with Dependency Grammar Features
This paper presents a novel method of improving Combinatory Categorial Grammar (CCG) parsing using features generated from Dependency Grammar (DG) parses and combined using reranking. Different grammar formalisms have different strengths and different parsing models have consequently divergent views of the data. More specifically, dependency parsers are sensitive to linguistic generalisations t...
متن کاملReranking a wide-coverage ccg parser
n-best parse reranking is an important technique for improving the accuracy of statistical parsers. Reranking is not constrained by the dynamic programming required for tractable parsing, so arbitrary features of each parse may be considered. We adapt the reranking features and methodology used by Charniak and Johnson (2005) for the C&C Combinatory Categorial Grammar parser, and develop new fea...
متن کاملBuilding Deep Dependency Structures using a Wide-Coverage CCG Parser
This paper describes a wide-coverage statistical parser that uses Combinatory Categorial Grammar (CCG) to derive dependency structures. The parser differs from most existing wide-coverage treebank parsers in capturing the long-range dependencies inherent in constructions such as coordination, extraction, raising and control, as well as the standard local predicate-argument dependencies. A set o...
متن کاملConverting a Dependency Treebank to a Categorial Grammar Treebank for Italian
The Turin University Treebank (TUT) is a treebank with dependency-based annotations of 2,400 Italian sentences. By converting TUT to binary constituency trees, it is possible to produce a treebank of derivations of Combinatory Categorial Grammar (CCG), with an algorithm that traverses a tree in a top-down manner, employing a stack to record argument structure, using Part of Speech tags to deter...
متن کاملBuilding Deep Dependency Structures with a Wide-Coverage CCG Parser
This paper describes a wide-coverage statistical parser that uses Combinatory Categorial Grammar (CCG) to derive dependency structures. The parser differs from most existing wide-coverage treebank parsers in capturing the long-range dependencies inherent in constructions such as coordination, extraction, raising and control, as well as the standard local predicate-argument dependencies. A set o...
متن کامل